Abstract: Several authors have studied stego-systems based on the Costa scheme, but only a few have given both theoretical and experimental justifications of the performance of these schemes in an active warden context. In this paper we provide a steganographic and comparative study of three informed stego-systems in the active warden context: the scalar Costa scheme, trellis-coded quantization and the spread transform scalar Costa scheme. Relying on analytical formulations and on experimental evaluations, we show the advantages and limits of each scheme in terms of statistical undetectability and capacity in the case of an active warden. The undetectability is given by the distance between the stego-signal and the cover-signal distributions, measured by the Kullback-Leibler distance.



Introduction 

Steganography is a very old field of data hiding that has been in use since Antiquity. As defined by Cox et al. [1], steganography denotes "the practice of undetectably altering a work to embed a message". In the classical prisoners' problem [2], Alice and Bob are in prison and try to escape. They can exchange documents, but these documents are controlled by an active warden named Wendy. Cox et al. [1] define the warden as active when "she intentionally modifies the content sent by Alice prior to receipt by Bob". These modifications can slightly alter the content and degrade the hidden information. In this work, we consider that all modifications performed by Wendy are modeled by an Additive White Gaussian Noise (AWGN), and we propose to study the limits of such systems. Since our specific active warden context is similar to the case of watermarking over an AWGN channel, we study the capacity according to the Shannon definition [1], i.e. the maximum number of information bits that can be embedded in one sample subject to a certain level of active warden attack (an AWGN attack in this case). In the sequel, we evaluate the statistical undetectability by the Kullback-Leibler Distance (KLD) between the probability density functions (p.d.f.) of the stego-signal and the cover-signal, since the warden detects the message by comparing the p.d.f. of the stego-document with that of the cover-document. In [7], the author used the KLD to evaluate the security of stego-systems in the context of the passive warden.



In this work, Cachin's security criterion is not used since the 
context is different (active warden context). 
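As an illustration of the undetectability measure used throughout the paper, the following sketch estimates the KLD between a cover-signal and a stego-signal from their histograms. It is only a minimal example under stated assumptions: the function name `kld_from_samples`, the number of bins, the smoothing constant `eps` and the direction of the divergence (cover relative to stego) are all choices made here for illustration, not details taken from the paper.

```python
import numpy as np

def kld_from_samples(cover, stego, bins=100, eps=1e-12):
    """Histogram estimate of the Kullback-Leibler distance D(p_cover || p_stego).

    The binning, the smoothing constant eps and the divergence direction
    are assumptions of this sketch.
    """
    cover = np.asarray(cover, dtype=float)
    stego = np.asarray(stego, dtype=float)
    # Use a common support so that both histograms share the same bins.
    lo = min(cover.min(), stego.min())
    hi = max(cover.max(), stego.max())
    p, _ = np.histogram(cover, bins=bins, range=(lo, hi))
    q, _ = np.histogram(stego, bins=bins, range=(lo, hi))
    # Smooth empty bins, then normalize to probability mass functions.
    p = (p + eps) / np.sum(p + eps)
    q = (q + eps) / np.sum(q + eps)
    return float(np.sum(p * np.log(p / q)))
```

A value close to zero means that the two empirical distributions are hard to tell apart; the estimate depends on the number of bins and on the sample size, so it should only be compared across systems evaluated with the same settings.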

We base our comparative study on informed data hiding schemes such as the Scalar Costa Scheme (SCS). One of the major works on this type of scheme, by Guillon et al. [3], found experimentally that SCS is statistically detectable due to artifacts in the p.d.f. of the stego-signal. The remedy proposed there is to apply a specific compressor to the signal, which leads to a less flexible scheme. Le Guelvouit [4] proposed to use Trellis-Coded Quantization (TCQ) in order to hide the message: the author shows experimentally that the p.d.f. of the stego-signal is not affected by the embedded message. We complete this study and also demonstrate this result theoretically. Moreover, we evaluate the steganographic performance, in an active warden context, of the Spread Transform Scalar Costa Scheme (ST-SCS) [1], which is often used for robust watermarking. We demonstrate with experiments and analytic formulations the good statistical undetectability of this system; we then compare its capacity, and its compromise between capacity and statistical undetectability, with those of the other systems.

Let us first list some notational conventions used in this paper. Vectors are denoted in bold font and sets in blackboard font. Data are written in small letters, and random variables in capital ones; $s[i]$ is the $i$-th component of vector $\mathbf{s}$. The probability density function of random variable $S$ is denoted by $p_S(\cdot)$.

I. Analysis of scalar Costa scheme 

Eggers et al. [5] introduced a sub-optimal scheme based on Costa's ideas [6]. The authors propose to construct a codebook from the reconstruction points of a scalar quantizer. This approach is called the Scalar Costa Scheme (SCS) and has a high capacity for the optimal value of Costa's factor $\alpha$. However, it has been shown [4] that the regular partitioning of scalar quantizers generates many artifacts in the p.d.f. of the marked signal.
Fig. 1. (a) Probability density functions of the host and marked signals using SCS for a document-to-watermark ratio equal to 13 dB with $\alpha = 0.3$ and (b) the probability density function of the stego-signal with the Guillon et al. scheme and of the original cover-signal.
Fig. 2. Asymmetric steganography scheme: the permanent phase is initialized with a temporary private key $k$.



For convenience, $u^*[i]$, $m[i]$ and $x[i]$ are denoted respectively as $u$, $m$ and $x$ in this section. If the information bits are equiprobable, then (see appendix V-A):

$$p_X(x) = \frac{1}{2(1-\alpha)} \sum_{m,u} \mathbb{1}_{\left[-\frac{(1-\alpha)\Delta}{2},\, \frac{(1-\alpha)\Delta}{2}\right]}(x-u)\; p_S\!\left(\frac{x-\alpha u}{1-\alpha}\right) \qquad (1)$$

where $\mathbb{1}_{[\cdot]}$ represents a unit window function and the sum runs over the two message bits $m$ and the reconstruction points $u$ of the corresponding quantizer. In this case, the distance between the reconstruction points of the two quantizers is equal to $\Delta/2$, so each window function overlaps its nearest neighbors if $(1-\alpha)\Delta/2 > \Delta/4$ (which is equivalent to $\alpha < 1/2$), while for $\alpha > 1/2$ the window functions are separated. This explains the aliasing observed in the p.d.f. of Fig. 1(a) for $\alpha = 0.3$.

For $\alpha = 1/2$, there are no holes and no aliasing, but we obtain a continuous p.d.f. only if $p_X(u/2) = p_X(u/2 + \Delta/4)$. This equality is satisfied only if the p.d.f. is uniform.

The observed discontinuities lead to a statistically detectable embedding. In the next part, we study an improved scheme based on SCS.
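The window argument above can be checked numerically. The sketch below is a minimal illustration, not the authors' implementation: it embeds bits with a plain SCS quantizer (the helper names `scs_embed` and `distance_to_half_lattice`, and the absence of a secret dither, are assumptions of this example) and measures how far each stego-sample lies from the nearest reconstruction point of the two interleaved quantizers.

```python
import numpy as np

def scs_embed(s, bits, alpha, delta):
    """Scalar Costa embedding: x = s + alpha * (Q(s) - s).

    Bit m selects the quantizer shifted by m * delta / 2 (no secret dither
    is used in this sketch).
    """
    s = np.asarray(s, dtype=float)
    d = np.asarray(bits) * delta / 2.0
    u = delta * np.round((s - d) / delta) + d  # nearest codeword for bit m
    x = s + alpha * (u - s)
    return x, u

def distance_to_half_lattice(x, delta):
    """Distance from x to the nearest point of the union of both codebooks."""
    half = delta / 2.0
    return np.abs(x - half * np.round(x / half))

rng = np.random.default_rng(0)
delta, alpha = 1.0, 0.7
s = rng.normal(0.0, 5.0, size=20000)
bits = rng.integers(0, 2, size=s.size)
x, u = scs_embed(s, bits, alpha, delta)
gap = distance_to_half_lattice(x, delta)
```

With $\alpha = 0.7$, every value in `gap` stays below $(1-\alpha)\Delta/2 = 0.15\Delta$, so no stego-sample falls in the remaining part of each half-period: these are the holes in the p.d.f. that make the embedding detectable. Repeating the experiment with $\alpha < 1/2$ removes the holes but makes neighboring windows overlap.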

A. Improvement of SCS: Guillon et al. scheme 

Building on the work of Anderson and Petitcolas [8], Guillon et al. [3] proposed a practical public-key steganography scheme using asymmetric cryptography and SCS. Fig. 2 summarizes the two phases of this scheme. In the initialization phase, a private key $k$ is generated with a pseudo-random generator and is encrypted with an asymmetric cipher algorithm. The encrypted key $C(k, k_{pub})$, where $k_{pub}$ is a public key known by all users, is embedded in the cover-signal. The permanent phase uses the transmitted key $k$ and SCS to embed and transmit the message $m$.

In the permanent phase, the statistical undetectability is mainly ensured by the private key, since it leads to an undistorted p.d.f. However, the initialization phase requires the transmission of public information without distorting the stego-signal. Guillon et al. proposed to use SCS with $\alpha = 1/2$ in order to hide a (statistically and perceptually) invisible message, but this is only valid for a cover-signal with uniform p.d.f.; they then proposed to use a compressor before embedding in order to equalize the p.d.f. of the cover-content. The embedded message is then statistically invisible, as shown in Fig. 1(b). Unfortunately, the resulting stego-system is less flexible, because the encoding and decoding steps depend heavily on the statistics of the cover-content. It has recently been shown [4] that the artifacts in the stego-signal are due to the use of a regularly partitioned codebook. In the next section, we propose to use a structured codebook by way of TCQ.

II. Analysis of the trellis-coded quantization 

The approach proposed here uses a trellis-based quantization to obtain a pseudo-random partitioning of the codebooks, in order to avoid the artifacts introduced in the p.d.f. of the stego-content by regular partitioning (as observed in the previous stego-system).

A. Principles 

Let us consider a trellis defined by a transition function $\mathrm{tr} : \mathcal{E} \times \{0,1\} \to \mathcal{E}$, $(e[i], m[i]) \mapsto e[i+1]$, where $\mathcal{E} = \{0, 1, \ldots\}$ is the finite set of possible states, whose largest index is a power of two fixed by an integer $r > 1$, and $i$ is the index of the current transition. Contrary to SCS, the dithering $d$ is not random but becomes a function of the current state and of the embedded symbol:

$$f : \mathcal{E} \times \{0,1\} \to [-\Delta/2, +\Delta/2], \quad (e[i], m[i]) \mapsto d[i]. \qquad (2)$$

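In this stego-system, the codebooks are defined by

$$\mathcal{U}_{m[i]} = \{\, n\Delta + f(e[i], m[i]),\; n \in \mathbb{Z} \,\},$$

and the closest codeword $\mathbf{u}^* \in \mathcal{U}_m$ to $\mathbf{s}$ is calculated using a Viterbi algorithm [9], with a strong a priori in order to be sure that the obtained codeword belongs to $\mathcal{U}_m$:

$$\mathbf{u}^* = \arg\min_{\mathbf{u} \in \mathcal{U}_m} \sum_{i=1}^{G} (s[i] - u[i])^2. \qquad (3)$$

The stego-signal is given by:

$$\mathbf{x} = \mathbf{s} + \alpha\,(\mathbf{u}^* - \mathbf{s}), \qquad (4)$$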
Fig. 3. Probability density functions of the cover and stego-signal for a document-to-watermark ratio equal to 13 dB by using TCQ with different values of $\alpha$: (a) $\alpha = 0.3$ and (b) $\alpha = 0.7$.



where $\mathbf{s}$ is the cover-signal and $\alpha$ represents Costa's parameter.

To extract the embedded message, we apply the Viterbi algorithm in order to retrieve the path which corresponds to the stego-signal.
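The embedding and extraction steps can be sketched as follows. This is only an illustrative toy under explicit assumptions: the shift-register transition rule, the random dither table built by `make_trellis`, the initial state `e0` and all function names are hypothetical choices of this example, not the trellis of the paper. Because the message fixes the path through the trellis, the embedder below simply follows that path and quantizes each sample to the shifted lattice of the current branch (which is what a Viterbi search with a very strong a priori on the message would return); the extractor runs a full Viterbi search over all paths.

```python
import numpy as np

def make_trellis(n_states, delta, seed=0):
    """Hypothetical trellis: shift-register transitions and a random dither table."""
    rng = np.random.default_rng(seed)
    # next_state[e, m]: state reached from state e when bit m is embedded.
    next_state = np.array([[(2 * e + m) % n_states for m in (0, 1)]
                           for e in range(n_states)])
    # dither[e, m] lies in [-delta/2, delta/2); the two branches of a state
    # are delta/2 apart so that the bits remain distinguishable.
    base = rng.uniform(-delta / 2.0, 0.0, size=n_states)
    dither = np.stack([base, base + delta / 2.0], axis=1)
    return next_state, dither

def tcq_embed(s, bits, alpha, delta, next_state, dither, e0=0):
    """Follow the message path and apply x = s + alpha * (u* - s)."""
    e = e0
    x = np.empty(len(s))
    for i, (si, m) in enumerate(zip(s, bits)):
        d = dither[e, m]
        u = delta * np.round((si - d) / delta) + d  # closest codeword on this branch
        x[i] = si + alpha * (u - si)
        e = next_state[e, m]
    return x

def tcq_extract(y, delta, next_state, dither, e0=0):
    """Viterbi search for the bit path whose codewords are closest to y."""
    n_states = next_state.shape[0]
    cost = np.full(n_states, np.inf)
    cost[e0] = 0.0
    paths = [[] for _ in range(n_states)]
    for yi in y:
        new_cost = np.full(n_states, np.inf)
        new_paths = [None] * n_states
        for e in range(n_states):
            if not np.isfinite(cost[e]):
                continue  # state not reachable yet
            for m in (0, 1):
                r = yi - dither[e, m]
                err = (r - delta * np.round(r / delta)) ** 2
                e2 = next_state[e, m]
                if cost[e] + err < new_cost[e2]:
                    new_cost[e2] = cost[e] + err
                    new_paths[e2] = paths[e] + [m]
        cost, paths = new_cost, new_paths
    return paths[int(np.argmin(cost))]
```

In this toy version the dither table plays the role of the key: without it, the lattice shifts change from sample to sample in a way that looks random, which is the property exploited to avoid the regular artifacts of SCS.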

B. Statistical analysis of TCQ 

In order to theoretically justify the use of TCQ to obtain statistical invisibility, we have calculated the p.d.f. of the stego-signal (see appendix V-B). We obtain:

$$p_X(x) = \frac{1}{\sigma_w\sqrt{12}} \int_{x-\sigma_w\sqrt{3}}^{x+\sigma_w\sqrt{3}} p_S(z)\, dz, \qquad (5)$$

where $\sigma_w$ is the standard deviation of the embedded signal. Thus $p_X(x)$ is the mean of the cover-signal p.d.f. over the interval centered on $x$ with width $\sigma_w\sqrt{12}$. We have evaluated Eqn. (5) for a signal with Gaussian p.d.f. and obtained the results presented in Fig. 3(a) and 3(b). We can notice the good match between the p.d.f. obtained with the TCQ algorithm (experimental), the theoretical versions and the original ones, for the same high embedding power.
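For a Gaussian cover, Eqn. (5) has a closed form in terms of the Gaussian cumulative distribution function. The sketch below evaluates it; the zero-mean cover, the function names and the way $\sigma_w$ is derived from a 13 dB document-to-watermark ratio are assumptions made for this illustration.

```python
import math

def gaussian_pdf(x, sigma):
    """Zero-mean Gaussian p.d.f. (the cover model assumed in this sketch)."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def gaussian_cdf(x, sigma):
    """Zero-mean Gaussian cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

def tcq_stego_pdf(x, sigma_s, sigma_w):
    """Eqn. (5): mean of the cover p.d.f. over a window of width sigma_w * sqrt(12)."""
    h = sigma_w * math.sqrt(3.0)  # half-width of the averaging window
    return (gaussian_cdf(x + h, sigma_s) - gaussian_cdf(x - h, sigma_s)) / (2.0 * h)

sigma_s = 1.0
sigma_w = sigma_s * 10.0 ** (-13.0 / 20.0)  # document-to-watermark ratio of 13 dB
peak_cover = gaussian_pdf(0.0, sigma_s)
peak_stego = tcq_stego_pdf(0.0, sigma_s, sigma_w)
```

The stego p.d.f. is a slightly smoothed version of the cover p.d.f.: its peak is a little lower and it has no holes or discontinuities, which is consistent with the match reported in Fig. 3.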

However, Fig. 4(a) shows that the capacity of TCQ is not as good as that of SCS. We can therefore use TCQ only in the initialization phase of the previous scheme (Fig. 2), because this phase requires just a limited payload.

III. Analysis of the spread transform scalar 
Costa scheme 

We propose to use the Spread Transform (ST), which allows any stego-system to increase its Watermark-to-Noise Ratio (WNR) [5] and improves the resistance against an active warden (who performs an AWGN attack in order to remove the stego-message).

A. Spread transform 

Chen and Wornell [10] introduced a general approach for robust watermarking applications. It allows the embedded message to be spread over several cover samples: they proposed to hide the message in a transformed domain [5]. In the sequel, the spreading parameter is modeled by a set of realizations of random variables with uniform p.d.f. To extract the hidden message, an inverse transformation is applied to the resulting signal.
Fig. 4. (a) The capacity of the SCS, TCQ and ST-SCS stego-systems as a function of the watermark-to-noise ratio and (b) the derivative of the function KLD($\alpha$) with respect to the parameter $\alpha$, in the case of the ST-SCS stego-system with $\tau = 2$.



In [5], the authors studied in particular the robustness of this system in order to apply it to robust watermarking. In this work, we study the steganographic performance of the spread transform system in the active warden context. We note that, before transmitting the information, the spread transform performs an inverse transformation in which the embedded signal strength is divided by the spreading factor $\tau$; then $\mathrm{DWR} = \mathrm{DWR}_\tau + 10\log_{10}(\tau)$, where DWR is the Document-to-Watermark Ratio and $\mathrm{DWR}_\tau$ is the Document-to-transformed-Watermark Ratio. Thus, the spread transform improves the perceptual invisibility of any hiding system.
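The effect of the spreading factor on the embedding power can be illustrated with a short sketch. It is a minimal example under assumptions made here, not the authors' code: the cover is cut into consecutive blocks of $\tau$ samples, each block is projected onto a direction with entries $\pm 1/\sqrt{\tau}$, SCS is applied to the projection, and the function names are hypothetical.

```python
import numpy as np

def st_scs_embed(s, bits, alpha, delta, tau, rng):
    """Spread-transform SCS: one bit per block of tau cover samples."""
    blocks = np.asarray(s, dtype=float).reshape(-1, tau)
    # Spreading direction with entries +-1/sqrt(tau), so each row has unit norm.
    t = rng.choice([-1.0, 1.0], size=blocks.shape) / np.sqrt(tau)
    s_t = np.sum(blocks * t, axis=1)               # transformed cover
    d = np.asarray(bits) * delta / 2.0
    u = delta * np.round((s_t - d) / delta) + d    # SCS codeword in the transformed domain
    w_t = alpha * (u - s_t)                        # transformed watermark
    x = blocks + w_t[:, None] * t                  # inverse transformation
    return x.reshape(-1), t, w_t

def st_scs_extract(x, t, delta):
    """Project the received signal and pick the closest of the two quantizers."""
    x_t = np.sum(np.asarray(x, dtype=float).reshape(t.shape) * t, axis=1)
    dist = []
    for m in (0, 1):
        r = x_t - m * delta / 2.0
        dist.append(np.abs(r - delta * np.round(r / delta)))
    return (dist[1] < dist[0]).astype(int)

rng = np.random.default_rng(1)
tau, delta, alpha = 4, 1.0, 0.7
bits = rng.integers(0, 2, size=500)
s = rng.normal(0.0, 5.0, size=500 * tau)
x, t, w_t = st_scs_embed(s, bits, alpha, delta, tau, rng)
power_ratio = np.mean(w_t ** 2) / np.mean((x - s) ** 2)
```

In this sketch `power_ratio` equals $\tau$ up to rounding errors, which is the $10\log_{10}(\tau)$ dB gain in document-to-watermark ratio stated above, and the bits are recovered without error when no attack is applied and $\alpha > 1/2$.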

B. Statistical analysis of ST-SCS 

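In the sequel, we focus only on the combination of the spread transform with the SCS-based stego-system in the active warden context. In order to evaluate the statistical undetectability of the stego-system, we develop a theoretical formulation of the ST-SCS stego-signal density (see appendix V-C):

$$p_X(x) = \frac{\tau}{4(\tau-\alpha)} \sum_{t,m,u} \int \delta\!\left(u - Q_\Delta\!\left(\frac{\tau}{\tau-\alpha}\,(x + \alpha y t - \alpha u t)\, t + y\right)\right) p_S\!\left(\frac{\tau}{\tau-\alpha}\,(x + \alpha y t - \alpha u t)\right) p_Y(y)\, dy.$$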




Fig. 5. Probability density functions of the cover and stego-signal by using ST-SCS with a document-to-watermark ratio equal to 13 dB and different values of $\alpha$: for $\tau = 2$ with (a) $\alpha = 0.3$ and (b) $\alpha = 0.7$; for $\tau = 10$ with (c) $\alpha = 0.3$ and (d) $\alpha = 0.7$.



In Fig. 5, the experimental p.d.f. of the stego-signal validates the theoretical model given by Eqn. (6), since the theoretical p.d.f. follows the experimental one.

If we replace $t$ with its two possible realizations, i.e. $\pm 1/\sqrt{\tau}$, and we take $\tau \to \infty$ with finite $\sigma_s^2$ (the variance of the cover-signal $s$), then:

$$p_X(x) = \frac{1}{4}\sum_{m,u} \int \delta(u - Q_\Delta(y))\, p_S(x)\, p_Y(y)\, dy + \frac{1}{4}\sum_{m,u} \int \delta(u - Q_\Delta(y))\, p_S(x)\, p_Y(y)\, dy.$$

Each of the two terms sums to $p_S(x)/2$, so the stego-signal $x$ has the same density as the cover-signal; in this case the two p.d.f. are both Gaussian. Moreover, Fig. 4(b) shows that the derivative of the KLD with respect to $\alpha$ is always negative and converges quickly to zero even for $\tau = 2$, so the KLD theoretically takes its minimal value for most values of the parameter $\alpha$ and for any value of the spreading factor $\tau$. In addition, experiments show that the stego-signal has the same p.d.f. as the cover-signal even for a small value of the spreading factor $\tau$ (see Fig. 5). We can see in Fig. 6(a), Fig. 7(a) and Fig. 7(b) that ST-SCS has the same level of statistical undetectability as the TCQ stego-system, and a better one than SCS.

C. Performance of ST-SCS 

Fig. 4(a) shows that for strong warden attacks (low WNR), the capacity of ST-SCS is better than that of TCQ. On the contrary, for high WNR values, the capacity of TCQ is better. As a result, it is very difficult to have a system which offers both good invisibility and good capacity; the compromise between these two characteristics therefore becomes important. Fig. 6(b) shows that the compromise of ST-SCS is the best in comparison with the SCS and TCQ stego-systems in the active warden context.

Fig. 6. (a) The Kullback-Leibler distance for the SCS, TCQ and ST-SCS stego-systems with Gaussian images as a function of DWR; (b) capacity vs. Kullback-Leibler distance for the SCS, TCQ and ST-SCS stego-systems with Gaussian images such that WNR $\in [-20, 12]$ dB and document-to-watermark ratio $\in [0, 40]$ dB.

We have applied SCS, the TCQ-based scheme and ST-SCS to 100 real images of size 350 × 350 pixels. Fig. 8 confirms the results obtained for Gaussian images: ST-SCS has the same undetectability level as TCQ, and a better one than SCS. However, the statistical undetectability will be the same as that of SCS in the transformed domain if the projection parameter is public.

In the case of public-key steganography (Fig. 2), we can use the TCQ stego-system in the initialization phase, to transmit the secret key, and ST-SCS in the permanent phase, which offers the best compromise between statistical undetectability and capacity.

Conclusion and Perspectives 

In this work, we have compared the steganographic performance of several informed stego-systems in the active warden context. For each system, the experimental results have been used to validate the theoretical model. For SCS, the regular partitioning introduces many artifacts in the p.d.f. of the stego-signal, which is also proved by the developed theoretical formulations. Based on these observations, we have proposed an analysis of two other systems. The first one is based on a pseudo-random partitioning (the TCQ-based system), which yields a more general and undetectable public stego-system (the technique does not depend on the cover-signal distribution). The second one is based on the combination of SCS with the spread transform (ST-SCS), which offers good statistical undetectability and the best compromise between capacity and undetectability. In future work, we shall study an improvement of the undetectability with a combination of ST and TCQ when the projection parameter is public. We shall also verify our theoretical models by applying them to real images.

Fig. 7. (a) The Kullback-Leibler distance of ST-SCS as a function of $\tau$ for different values of $\alpha$ and (b) the Kullback-Leibler distance as a function of $\alpha$ for different values of $\tau$.

Fig. 8. KLD vs. document-to-watermark ratio with 100 real images of size 350 × 350.

IV. Acknowledgment 

The authors would like to thank Professor Pierre Duhamel 
for his help and collaboration to this paper and the ESTIVALE 
project from ANR (French national agency of research) for 
funding. 

V. Appendix 
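A. Demonstration of Eqn. (1)

We model the stego-signal by a set of realizations of Gaussian random variables, independent and non-stationary: $X = \{X[1], \ldots, X[G]\}$. It is given by the following equation (in the sequel, we omit the index of the variable for ease of presentation):

$$X = (1-\alpha)\, S + \alpha\, U, \qquad (6)$$

where $\alpha$ represents Costa's optimization parameter and the cover-signal is modeled by a set of realizations of Gaussian random variables, independent and non-stationary: $S = \{S[1], \ldots, S[G]\}$. According to the product rule

$$p(s|u,m) = \frac{p(u|s,m)\, p_S(s)}{p(u|m)},$$

and we have:

$$p(u|s,m) = \delta(u - Q_\Delta(s)), \qquad (7)$$

where $Q_\Delta(\cdot)$ represents a scalar quantizer with step $\Delta$. On the other hand,

$$p(s|m) = \sum_u p(s|u,m)\, p(u|m) = \sum_u \delta(u - Q_\Delta(s))\, p_S(s).$$

If we replace $S = \frac{X - \alpha U}{1-\alpha}$ in the last equation, we obtain

$$p(x|m) = \frac{1}{1-\alpha} \sum_u \delta\!\left(u - Q_\Delta\!\left(\frac{x-\alpha u}{1-\alpha}\right)\right) p_S\!\left(\frac{x-\alpha u}{1-\alpha}\right).$$

When the information bits are equiprobable, we write:

$$p_X(x) = \frac{1}{2(1-\alpha)} \sum_{m,u} \mathbb{1}_{\left[-\frac{(1-\alpha)\Delta}{2},\, \frac{(1-\alpha)\Delta}{2}\right]}(x-u)\; p_S\!\left(\frac{x-\alpha u}{1-\alpha}\right).$$
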

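B. Demonstration of Eqn. (5)

We denote by $e[i]$, for $i = 1, \ldots, N$, the trellis states and we suppose that all these states follow a uniform distribution such that $p_E(e) = 1/N$. In the TCQ-based stego-system, we substitute the cover-samples by $u_{(n,m,e)}$, $n \in \mathbb{Z}$, the codeword of the sub-codebook which corresponds to the state $e$ and the message bit $m$. It is given by $u_{(n,m,e[i])} = (n + m/2 - i/N)\Delta$ for $i = 1, \ldots, N/2$ and $u_{(n,m,e[i])} = u_{(n,m,e[i-N/2])}$ for $i = N/2 + 1, \ldots, N$. Relying on appendix V-A, the p.d.f. of the TCQ stego-signal for a fixed state $e$ is:

$$p(x|e) = \frac{1}{2(1-\alpha)} \sum_{n,m} \mathbb{1}_{\left[-\frac{(1-\alpha)\Delta}{2},\, \frac{(1-\alpha)\Delta}{2}\right]}\!\left(x - u_{(n,m,e)}\right) p_S\!\left(\frac{x - \alpha\, u_{(n,m,e)}}{1-\alpha}\right) \qquad (8)$$

and

$$p_X(x) = \sum_{i=1}^{N} p_X(x|e[i])\, p_E(e[i]) = \frac{1}{2(1-\alpha)N} \sum_{n,m} \sum_{i=1}^{N} \mathbb{1}_{\left[-\frac{(1-\alpha)\Delta}{2},\, \frac{(1-\alpha)\Delta}{2}\right]}\!\left(x - u_{(n,m,e[i])}\right) p_S\!\left(\frac{x - \alpha\, u_{(n,m,e[i])}}{1-\alpha}\right). \qquad (9)$$

If the number of states is large, then, since the states $e[i]$ and $e[i-N/2]$ share the same sub-codebook, the properties of the Riemann sum give:

$$p_X(x) = \frac{1}{1-\alpha} \sum_{n,m} \int_0^{1/2} \mathbb{1}_{\left[-\frac{(1-\alpha)\Delta}{2},\, \frac{(1-\alpha)\Delta}{2}\right]}\!\left(x - \left(n + \tfrac{m}{2} - \gamma\right)\Delta\right) p_S\!\left(\frac{x - \alpha\left(n + \tfrac{m}{2} - \gamma\right)\Delta}{1-\alpha}\right) d\gamma. \qquad (10)$$

If we replace $m$ by its two possible values, i.e. 0 or 1, and we make the following variable change $Z = \frac{X - \alpha\left(n + \frac{m}{2} - \gamma\right)\Delta}{1-\alpha}$, we obtain, with $\sigma_w = \alpha\Delta/\sqrt{12}$:

$$p_X(x) = \frac{1}{\alpha\Delta} \int_{x-\alpha\Delta/2}^{x+\alpha\Delta/2} p_S(z)\, dz = \frac{1}{\sigma_w\sqrt{12}} \int_{x-\sigma_w\sqrt{3}}^{x+\sigma_w\sqrt{3}} p_S(z)\, dz.$$

C. Demonstration of Eqn. (6)

The transformation of the cover-signal is modeled by a set of realizations of Gaussian random variables, independent and non-stationary, i.e. $S^t = \{S^t[1], \ldots, S^t[G/\tau]\}$. In addition, we take the spreading direction $\mathbf{t}$ such that $\forall i,\ t[i] = \pm 1/\sqrt{\tau}$, and it is modeled by a set of independent and non-stationary random variables, i.e. $T = \{T[1], \ldots, T[N]\}$. Then, when ST-SCS is used to embed the message, the stego-signal $X$ is given by $X = S + \alpha\,(U - S^t)\,T$, if we consider:

$$S^t = \sum_{i=\tau l}^{\tau l + \tau - 1} S[i] \times T[i] = S[n] \times T[n] + \underbrace{\sum_{i \neq n} S[i] \times T[i]}_{Y},$$

where $Y$ is considered as a random variable modeled by a set $\mathcal{Y} = \{Y[1], \ldots, Y[G/\tau]\}$.

Now, we compute the p.d.f. of the codeword $U$ conditionally to $S$, $Y$, $T$ and the message $m$:

$$p(u|s,y,t,m) = \delta(u - Q_\Delta(st + y)), \qquad (13)$$

where $\delta$ represents the Kronecker symbol. Therefore

$$p(s|u,y,t,m) = \frac{\delta(u - Q_\Delta(st + y))\, p(s|y,t,m)}{p(u|y,t,m)}. \qquad (14)$$

In this work, we consider $S$ as a random variable independent of $T$ and $Y$. Therefore $p(s|y,t,m) = p_S(s)$ and

$$p(s|u,y,t,m) = \frac{\delta(u - Q_\Delta(st + y))\, p_S(s)}{p(u|y,t,m)}. \qquad (15)$$

Now, we make the following variable change:

$$S = \frac{\tau}{\tau-\alpha}\,(X + \alpha Y T - \alpha U T). \qquad (16)$$

Then, we obtain

$$p(x|u,y,t,m) = \frac{\tau}{\tau-\alpha}\; \frac{\delta\!\left(u - Q_\Delta\!\left(\frac{\tau}{\tau-\alpha}\,(x + \alpha y t - \alpha u t)\, t + y\right)\right)}{p(u|y,t,m)}\; p_S\!\left(\frac{\tau}{\tau-\alpha}\,(x + \alpha y t - \alpha u t)\right). \qquad (17)$$

Since $T$ is a random variable whose realizations take just two values $\pm 1/\sqrt{\tau}$, and since $m$ is also considered as equiprobable, the marginalization over these two variables and over $U$ and $Y$ gives:

$$p_X(x) = \frac{\tau}{4(\tau-\alpha)} \sum_{t,m,u} \int \delta\!\left(u - Q_\Delta\!\left(\frac{\tau}{\tau-\alpha}\,(x + \alpha y t - \alpha u t)\, t + y\right)\right) p_S\!\left(\frac{\tau}{\tau-\alpha}\,(x + \alpha y t - \alpha u t)\right) p_Y(y)\, dy.$$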